60 - Exercise_03

Hi, we're going to discuss exercise sheet three today.

Exercise number six will be about proving that we can alternatively characterize the Tikhonov-regularized reconstruction by solving a quadratic optimization problem and by solving the linear normal equation, respectively.

So the generalized Tikhonov-regularized solution is the argmin of the following quadratic functional: F(u) = ½‖Au − f‖² + (λ/2)‖Lu‖². I'm going to write this as F(u), and it's easy to see that F(u) can also be written as ½ uᵀAᵀAu + (λ/2) uᵀLᵀLu − uᵀAᵀf + ½ fᵀf.

And the minimizer of such a quadratic functional is given by the root of its gradient; it is unique provided the matrix AᵀA + λLᵀL is symmetric positive definite.

So we just have to compute the gradient of F in u, and by differentiating this quadratic expression, which is something you have probably learned in a calculus class, we get ∇F(u) = AᵀAu + λLᵀLu − Aᵀf.

Now you see that ∇F(u) = 0 holds if and only if the linear equation (AᵀA + λLᵀL)u = Aᵀf holds. That is exactly the normal equation for generalized Tikhonov regularization.
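
As a small illustration of this equivalence, here is a minimal NumPy sketch (the matrix A, the regularization matrix L, the data f and the parameter λ below are random or simple placeholders, not values from the exercise sheet): the solution of the normal equation is a stationary point of F, and perturbing it only increases F.

```python
import numpy as np

# Minimal sketch: A, L, f and lam are placeholder values, not from the exercise sheet.
rng = np.random.default_rng(0)
m, n = 20, 10
A = rng.standard_normal((m, n))
L = np.eye(n)                    # simple choice of regularization matrix
f = rng.standard_normal(m)
lam = 0.1

def F(u):
    # F(u) = 1/2 ||A u - f||^2 + lam/2 ||L u||^2
    return 0.5 * np.sum((A @ u - f) ** 2) + 0.5 * lam * np.sum((L @ u) ** 2)

def grad_F(u):
    # grad F(u) = A^T A u + lam L^T L u - A^T f
    return A.T @ (A @ u) + lam * L.T @ (L @ u) - A.T @ f

# Solve the normal equation (A^T A + lam L^T L) u = A^T f ...
u_star = np.linalg.solve(A.T @ A + lam * L.T @ L, A.T @ f)

# ... its solution is a stationary point of F ...
print(np.allclose(grad_F(u_star), 0.0))        # True

# ... and perturbing it in any direction only increases F, so it is the minimizer.
for _ in range(5):
    v = rng.standard_normal(n)
    assert F(u_star + 0.1 * v) > F(u_star)
```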

Exercise seven.

This is about the gradient descent algorithm. The gradient descent algorithm has the form x_{n+1} = x_n − α_n ∇φ(x_n), with some step size α_n. After sufficiently many steps this leads to the minimizer of φ(x), or at least to a local minimum.

Now, what if φ(x) has the specific form φ(x) = ½ xᵀBx − bᵀx? In particular, the generalized Tikhonov functional is of this form: we just have to set B = AᵀA + λLᵀL and lowercase b = Aᵀf, and that leads to exactly the same form (up to the constant ½ fᵀf, which does not affect the minimizer).

And for this quadratic the gradient is ∇φ(x) = Bx − b. Its negative, b − Bx, is what we also call the residual R.

The specific form of gradient descent for quadratic functionals is something that we discussed in quite some detail a few lectures ago. The update is x_{n+1} = x_n + α_n R_n, where R_n = b − Bx_n is the negative gradient, and the step size is α_n = (R_nᵀ R_n)/(R_nᵀ B R_n). This is the correct step size to choose.
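
As a small illustration, here is a minimal NumPy sketch of this exact-line-search gradient descent for a quadratic (B and b below are random SPD placeholders, not data from the lecture); the iteration drives the residual to zero and ends at the minimizer of φ.

```python
import numpy as np

# Sketch of gradient descent for phi(x) = 1/2 x^T B x - b^T x with the exact step
# size alpha_n = (R_n^T R_n) / (R_n^T B R_n). B and b are random SPD placeholders.
rng = np.random.default_rng(1)
n = 10
M = rng.standard_normal((n, n))
B = M.T @ M + n * np.eye(n)           # symmetric positive definite
b = rng.standard_normal(n)

x = np.zeros(n)
for _ in range(200):
    R = b - B @ x                     # residual R_n = negative gradient of phi
    if np.linalg.norm(R) < 1e-12:
        break
    alpha = (R @ R) / (R @ (B @ R))   # exact line-search step size
    x = x + alpha * R                 # x_{n+1} = x_n + alpha_n R_n

print(np.allclose(x, np.linalg.solve(B, b)))   # True: x is the minimizer of phi
```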

And the goal of this exercise is to prove the following: if R_0, the initial residual, which can be written as b − Bx_0, is an eigenvector of B, meaning B R_0 = λ R_0 for some λ > 0 (λ could in principle be zero, but we won't discuss that case), then x_1, the first iterate, will be equal to the minimizer of the functional φ.

So instead of the zigzag behaviour you typically see for gradient descent, starting with the right initialization gives you a gradient descent algorithm that converges in just one step, which is ideal of course.

Now how can we prove this?

The first step is to collect what we know: R_0 is the same as −B E_0, where E_0 = x_0 − x* is the initial error (x* being the minimizer), and this in particular implies that B⁻¹ R_0 = −E_0.

Moreover, B R_0 = λ R_0 implies R_0 = λ B⁻¹ R_0, which means that R_0 is also an eigenvector of B⁻¹ (with eigenvalue 1/λ).
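
As a small numerical check of the claim (B below is again a random SPD placeholder, and x0 is deliberately constructed so that R0 is an eigenvector of B), a single exact-line-search step indeed lands on the minimizer.

```python
import numpy as np

# Numerical check: if R0 = b - B x0 is an eigenvector of B, then one
# exact-line-search step already reaches the minimizer. B is a random SPD
# placeholder; x0 is constructed so that R0 is an eigenvector.
rng = np.random.default_rng(2)
n = 8
M = rng.standard_normal((n, n))
B = M.T @ M + n * np.eye(n)           # symmetric positive definite

eigvals, eigvecs = np.linalg.eigh(B)
v = eigvecs[:, 0]                     # eigenvector of B: B v = eigvals[0] * v

x_min = rng.standard_normal(n)        # choose the minimizer first ...
b = B @ x_min                         # ... so that B x_min = b
x0 = x_min - v                        # then R0 = b - B x0 = B v, an eigenvector of B

R0 = b - B @ x0
alpha0 = (R0 @ R0) / (R0 @ (B @ R0))  # equals 1 / eigvals[0]
x1 = x0 + alpha0 * R0

print(np.allclose(x1, x_min))         # True: the first iterate is already the minimizer
```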
